<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 3.2//EN">
<html>
<head>
<meta name="generator" content="HTML Tidy, see www.w3.org">
<title>Jena2 reification API proposal</title>
<link href="styles/doc.css" rel="stylesheet" type="text/css">
</head>
<body>
<h1>Jena2 reification API proposal</h1>



<ul>
<li><strong>version</strong>: 1.0</li>

<li><strong>author</strong>: Chris Dollin, material from Brian
McBride</li>

<li><strong>date</strong>: March 31st 2003</li>
</ul>



<p><br>
<a href="#here1">1 introduction</a><br>
 &nbsp; <a href="#here2">1.1 status</a><br>
 &nbsp; <a href="#here3">1.2 context</a><br>
<a href="#here4">2 presentation API</a><br>
 &nbsp; <a href="#here5">2.1 retrieval</a><br>
 &nbsp; <a href="#here6">2.2 creation</a><br>
 &nbsp; <a href="#here7">2.3 equality</a><br>
 &nbsp; <a href="#here8">2.4 isReified</a><br>
 &nbsp; <a href="#here9">2.5 fetching</a><br>
 &nbsp; <a href="#here10">2.6 listing</a><br>
 &nbsp; <a href="#here11">2.7 removal</a><br>
 &nbsp; <a href="#here12">2.8 input and output</a><br>
<a href="#here13">3 performance</a></p>

<h1><a name="here1"></a>1 introduction</h1>

<h2><a name="here2"></a>1.1 status</h2>

<p>This document describes the reification API in Jena2, following
discussions based on the 0.5a document. The essential decision made
during those discussions is that reification triples are captured and
dealt with by the Model transparently and appropriately.</p>

<h2><a name="here3"></a>1.2 context</h2>

<p>The first Jena implementation made some attempt to optimise the
representation of reification. In particular, it tried to avoid
so-called 'triple bloat', <i>ie</i> requiring four triples to
represent the reification of a statement. The approach taken was to
make a <i>Statement</i> a subclass of <i>Resource</i> so that
properties could be directly attached to statement objects.</p>

<p>There are a number of defects in the Jena 1 approach.</p>

<ul>
<li>Not everyone in the team had bought into the approach</li>

<li>The <i>.equals()</i> method for <i>Statement</i>s was arguably
wrong and also violated the Java contract for
<i>.equals()</i></li>

<li>The implied triples of a reification were not present so could
not be searched for</li>

<li>There was confusion between the optimised representation and
explicit representation of reification using triples</li>

<li>The optimisation did not round trip through RDF/XML using the
writers and ARP.</li>
</ul>

<p>However, the approach has some supporters. They liked:</p>

<ul>
<li>the avoidance of triple bloat</li>

<li>that the extra reification statements are not there to be
found by queries or <i>listStatements</i> and do not affect the
<i>size()</i> method.</li>
</ul>

<p>Since Jena was first written the RDFCore WG have clarified the
meaning of a reified statement. Whilst Jena 1 took a reified
statement to denote a statement, RDFCore have decided that a
reified statement denotes an occurrence of a statement, otherwise
called a stating. The Jena 1 <i>.equals()</i> method for
<i>Statement</i>s is thus inappropriate for comparing reified
statements, since two distinct statings of the same statement
should not compare equal.</p>

<p>The goals of reification support in the Jena 2 implementation
are:</p>

<ol>
<li>to conform to the revised RDF specifications</li>

<li>to maintain the expectations of Jena 1 users; <i>ie</i> they should
still be able to reify everything without worrying about triple
bloat if they want to</li>

<li>as far as is consistent with 2, to not break existing code, or
at least make it easy to transition old code to Jena 2</li>

<li>to enable round tripping through RDF/XML and other RDF
representation languages</li>

<li>to enable a complete standards-compliant implementation, but not
necessarily as the default</li>
</ol>







<h1><a name="here4"></a>2 presentation API</h1>

<p><i>Statement</i> will no longer be a subclass of
<i>Resource</i>. Thus a statement may not be used where a resource
is expected. Instead, a new interface <i>ReifiedStatement</i> will
be defined:</p>

<pre>
public interface ReifiedStatement extends Resource
    {
    public Statement getStatement();
    // could call it a day at that or could duplicate convenience
    // methods from Statement, eg getSubject(), getInt().
    ...
    }
</pre>


The <i>Statement</i> interface will be extended with the following
methods:<br>

<pre>
public interface Statement
    ...
    public ReifiedStatement createReifiedStatement();
    public ReifiedStatement createReifiedStatement(String URI);
/* */
    public boolean isReified();
    public ReifiedStatement getAnyReifiedStatement();
/* */
    public RSIterator listReifiedStatements();
/* */
    public void removeAllReifications();
    ...
</pre>


<i>RSIterator</i> is a new iterator which returns
<i>ReifiedStatement</i>s. It is an extension of
<i>ResourceIterator</i>.<br>

<p>The <i>Model</i> interface will be extended with the following
methods:</p>

<pre>
public interface Model
    ...
    public ReifiedStatement createReifiedStatement(Statement stmt);
    public ReifiedStatement createReifiedStatement(String URI, Statement stmt);
/* */
    public boolean isReified(Statement st);
    public ReifiedStatement getAnyReifiedStatement(Statement stmt);
/* */
    public RSIterator listReifiedStatements();
    public RSIterator listReifiedStatements(Statement stmt);
/* */
    public void removeReifiedStatement(ReifiedStatement rs);
    public void removeAllReifications(Statement st);
    ...
</pre>


The methods in <i>Statement</i> are defined to be the obvious calls
of methods in <i>Model</i>. The interaction of those methods is
expressed below. Reification operates over statements in the model
which use predicates <strong>rdf:subject</strong>,
<strong>rdf:predicate</strong>, <strong>rdf:object</strong>, and
<strong>rdf:type</strong> with object
<strong>rdf:Statement</strong>.<br>

<p><i>Statements with those predicates are, by default,
invisible</i>. They do not appear in calls of
<i>listStatements</i>, <i>contains</i>, or uses of the <i>Query</i>
mechanism. Adding them to the model will not affect <i>size()</i>.
Models that do not hide reification quads will also be
available.</p>
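<p>The hiding rule can be illustrated with a minimal sketch in plain
Java. This is not Jena code: <i>ToyHidingModel</i>, its constructor
flag, and <i>isQuadComponent</i> are hypothetical names invented for
the illustration, triples are plain strings, and duplicate triples are
not detected. It only shows which triples count towards
<i>size()</i> under the default (hiding) behaviour and under the
non-hiding alternative.</p>

<pre>
/**
 * Toy sketch only: counts triples the way a quad-hiding model is
 * described as reporting them. Every name here is hypothetical.
 * For brevity it does not detect duplicate triples.
 */
final class ToyHidingModel {
    static final String RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#";

    private final boolean hideQuads;
    private int visible = 0;

    ToyHidingModel(boolean hideQuads) {
        this.hideQuads = hideQuads;
    }

    /** True for rdf:subject, rdf:predicate, rdf:object, and rdf:type rdf:Statement. */
    static boolean isQuadComponent(String predicate, String object) {
        if (predicate.equals(RDF + "subject")) return true;
        if (predicate.equals(RDF + "predicate")) return true;
        if (predicate.equals(RDF + "object")) return true;
        if (predicate.equals(RDF + "type")) return object.equals(RDF + "Statement");
        return false;
    }

    /** Adds a triple; hidden quad components are accepted but not counted. */
    void add(String subject, String predicate, String object) {
        boolean hidden = hideQuads ? isQuadComponent(predicate, object) : false;
        if (!hidden) visible++;
    }

    /** Number of triples a caller can see. */
    int size() {
        return visible;
    }
}

class HidingDemo {
    public static void main(String[] args) {
        String rdf = ToyHidingModel.RDF;
        ToyHidingModel model = new ToyHidingModel(true);
        model.add("ex:a", "ex:knows", "ex:b");
        model.add("ex:r", rdf + "type", rdf + "Statement");
        model.add("ex:r", rdf + "subject", "ex:a");
        model.add("ex:r", rdf + "predicate", "ex:knows");
        model.add("ex:r", rdf + "object", "ex:b");
        System.out.println(model.size()); // prints 1: only the ordinary triple is visible
    }
}
</pre>

<p>Constructing the toy model with <i>false</i> instead would count
all five triples, which corresponds to the non-hiding models mentioned
above.</p>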

<h2><a name="here5"></a>2.1 retrieval</h2>

<p>The <i>Model::as()</i> mechanism will allow the retrieval of
reified statements.</p>

<blockquote>someResource.as( ReifiedStatement.class )</blockquote>

If <i>someResource</i> has an associated reification quad, then
this will deliver an instance <i>rs</i> of <i>ReifiedStatement</i>
such that <i>rs.getStatement()</i> will be the statement <i>rs</i>
reifies. Otherwise a <i>DoesNotReifyException</i> will be thrown.
(Use the predicate <i>canAs()</i> to test if the conversion is
possible.)<br>

<p>It does not matter how the quad components have arrived in the
model, whether explicitly asserted or by the <i>create</i> mechanisms
described below. If quad components are removed from the model,
existing <i>ReifiedStatement</i> objects will continue to function,
but conversions using <i>as()</i> will fail.</p>

<h2><a name="here6"></a>2.2 creation</h2>

<p><i>createReifiedStatement(Statement stmt)</i> creates a new
<i>ReifiedStatement</i> object that reifies <i>stmt</i>; the
appropriate quads are inserted into the model. The resulting
resource is a blank node.</p>

<p><i>createReifiedStatement(String URI, Statement stmt)</i>
creates a new <i>ReifiedStatement</i> object that reifies
<i>stmt</i>; the appropriate quads are inserted into the model. The
resulting resource is a <i>Resource</i> with the URI given.</p>

<h2><a name="here7"></a>2.3 equality</h2>

<p>Two reified statements are <i>.equals()</i> iff they reify the
same statement and have <i>.equals()</i> resources. Thus it is
possible for equal <i>Statement</i>s to have unequal
reifications.</p>
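<p>The equality rule can be sketched with two small value classes.
These are stand-ins written for this illustration, not the Jena
interfaces: <i>ToyStatement</i> and <i>ToyReifiedStatement</i> are
hypothetical names, and resources are represented as plain strings.
The sketch shows equal statements whose reifications differ because
the reifying resources differ.</p>

<pre>
import java.util.Objects;

/** Hypothetical stand-in for a Statement: a subject/predicate/object triple. */
final class ToyStatement {
    private final String subject;
    private final String predicate;
    private final String object;

    ToyStatement(String subject, String predicate, String object) {
        this.subject = subject;
        this.predicate = predicate;
        this.object = object;
    }

    @Override
    public boolean equals(Object other) {
        if (!(other instanceof ToyStatement)) return false;
        ToyStatement that = (ToyStatement) other;
        if (!subject.equals(that.subject)) return false;
        if (!predicate.equals(that.predicate)) return false;
        return object.equals(that.object);
    }

    @Override
    public int hashCode() {
        return Objects.hash(subject, predicate, object);
    }
}

/** Hypothetical stand-in for a ReifiedStatement: a resource plus the statement it reifies. */
final class ToyReifiedStatement {
    private final String resource;
    private final ToyStatement statement;

    ToyReifiedStatement(String resource, ToyStatement statement) {
        this.resource = resource;
        this.statement = statement;
    }

    ToyStatement getStatement() {
        return statement;
    }

    /** Equal only when both the resource and the reified statement are equal. */
    @Override
    public boolean equals(Object other) {
        if (!(other instanceof ToyReifiedStatement)) return false;
        ToyReifiedStatement that = (ToyReifiedStatement) other;
        if (!resource.equals(that.resource)) return false;
        return statement.equals(that.statement);
    }

    @Override
    public int hashCode() {
        return Objects.hash(resource, statement);
    }
}

class ReifiedEqualityDemo {
    public static void main(String[] args) {
        ToyStatement s1 = new ToyStatement("ex:a", "ex:knows", "ex:b");
        ToyStatement s2 = new ToyStatement("ex:a", "ex:knows", "ex:b");
        ToyReifiedStatement r1 = new ToyReifiedStatement("ex:r1", s1);
        ToyReifiedStatement r2 = new ToyReifiedStatement("ex:r2", s2);
        ToyReifiedStatement r3 = new ToyReifiedStatement("ex:r1", s2);
        System.out.println(s1.equals(s2)); // true: the statements are equal
        System.out.println(r1.equals(r2)); // false: different reifying resources
        System.out.println(r1.equals(r3)); // true: same resource, equal statements
    }
}
</pre>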

<h2><a name="here8"></a>2.4 isReified</h2>

<p><i>isReified(Statement st)</i> is true iff in the <i>Model</i>
of this <i>Statement</i> there is a reification quad for this
<i>Statement</i>. It does not matter if the quad was inserted
piece-by-piece or all at once using a <i>create</i> method.</p>

<h2><a name="here9"></a>2.5 fetching</h2>

<p><i>getAnyReifiedStatement(Statement st)</i> delivers an existing
<i>ReifiedStatement</i> object that reifies <i>st</i>, if there is
one; otherwise it creates a new one. If there are multiple
reifications for <i>st</i>, it is not specified which one will be
returned.</p>

<h2><a name="here10"></a>2.6 listing</h2>

<p><i>listReifiedStatements()</i> will return an <i>RSIterator</i>
which will deliver all the reified statements in the model.</p>

<p><i>listReifiedStatements( Statement st )</i> will return an
<i>RSIterator</i> which will deliver all the reified statements in
the model that reify <i>st</i>.</p>

<h2><a name="here11"></a>2.7 removal</h2>

<p><i>removeReifiedStatement(ReifiedStatement rs)</i> will remove
the reification <i>rs</i> from the model by removing the
reification quad. Other reified statements with different resources
will remain.</p>

<p><i>removeAllReifications(Statement st)</i> will remove all the
reifications in this model which reify <i>st</i>.</p>
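<p>The behaviour described in sections 2.2 and 2.4 to 2.7 can be
walked through with a toy in-memory sketch. It is not the Jena
implementation: <i>ToyReifier</i> is a hypothetical class, statements
and resources are plain strings, <i>countReifications</i> stands in
for exhausting the iterator from <i>listReifiedStatements</i>, and the
sketch does not guard against reusing a URI for the same statement.
The method names otherwise follow the proposed <i>Model</i>
methods.</p>

<pre>
/**
 * Toy in-memory sketch of the reification operations described above.
 * Not Jena code: statements and resources are plain strings, and the
 * class and its fields are hypothetical.
 */
final class ToyReifier {
    /** One reification: the reifying resource and the statement it reifies. */
    private static final class Entry {
        final String resource;
        final String statement;
        final Entry next;

        Entry(String resource, String statement, Entry next) {
            this.resource = resource;
            this.statement = statement;
            this.next = next;
        }
    }

    private Entry head = null;
    private int blankNodes = 0;

    /** Reifies stmt using a fresh blank-node label as the resource. */
    String createReifiedStatement(String stmt) {
        blankNodes++;
        return createReifiedStatement("_:b" + blankNodes, stmt);
    }

    /** Reifies stmt using the given URI as the reifying resource. */
    String createReifiedStatement(String uri, String stmt) {
        head = new Entry(uri, stmt, head);
        return uri;
    }

    boolean isReified(String stmt) {
        return countReifications(stmt) != 0;
    }

    /** Returns some existing reification of stmt, creating one if none exists. */
    String getAnyReifiedStatement(String stmt) {
        for (Entry e = head; e != null; e = e.next) {
            if (e.statement.equals(stmt)) return e.resource;
        }
        return createReifiedStatement(stmt);
    }

    /** Stands in for exhausting listReifiedStatements(stmt). */
    int countReifications(String stmt) {
        int n = 0;
        for (Entry e = head; e != null; e = e.next) {
            if (e.statement.equals(stmt)) n++;
        }
        return n;
    }

    /** Removes the reification with this resource; others remain. */
    void removeReifiedStatement(String resource) {
        Entry kept = null;
        for (Entry e = head; e != null; e = e.next) {
            if (!e.resource.equals(resource)) kept = new Entry(e.resource, e.statement, kept);
        }
        head = kept;
    }

    /** Removes every reification of stmt. */
    void removeAllReifications(String stmt) {
        Entry kept = null;
        for (Entry e = head; e != null; e = e.next) {
            if (!e.statement.equals(stmt)) kept = new Entry(e.resource, e.statement, kept);
        }
        head = kept;
    }
}

class ReifierDemo {
    public static void main(String[] args) {
        ToyReifier model = new ToyReifier();
        String stmt = "ex:a ex:knows ex:b";
        System.out.println(model.isReified(stmt));         // false
        model.getAnyReifiedStatement(stmt);                // none exists, so one is created
        model.createReifiedStatement("ex:r1", stmt);
        System.out.println(model.countReifications(stmt)); // 2
        model.removeReifiedStatement("ex:r1");
        System.out.println(model.countReifications(stmt)); // 1
        model.removeAllReifications(stmt);
        System.out.println(model.isReified(stmt));         // false
    }
}
</pre>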

<h2><a name="here12"></a>2.8 input and output</h2>

<p>The writers will have access to the complete set of
<i>Statement</i>s and will be able to write out the quad
components.</p>

<p>The readers need have no special machinery, but it would be
efficient for them to be able to call <i>createReifiedStatement</i>
when detecting a reification.</p>





<h1><a name="here13"></a>3 performance</h1>

<p>Jena1's "statements as resources" approach avoided triple bloat
by not storing the reification quads. How, then, do we avoid triple
bloat in Jena2?</p>

<p>The underlying machinery is intended to capture the reification
quad components and store them in a form optimised for reification.
In particular, in the case where a statement is completely reified,
it is expected to store only the implementation representation of
the <i>Statement</i>.</p>
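<p>One way to picture this capture step is the following sketch,
which collects the quad components seen for a single reifying
resource and reports when the reification is complete. It is an
illustration under assumptions, not the actual machinery:
<i>ToyQuadFragments</i> and all of its methods are hypothetical,
components are plain strings, and conflicting components (for example
two different <strong>rdf:subject</strong> values) are not
handled.</p>

<pre>
/**
 * Toy sketch only: collects the quad components seen for one reifying
 * resource. Not Jena code; the class and its methods are hypothetical.
 */
final class ToyQuadFragments {
    private String subject;
    private String predicate;
    private String object;
    private boolean typed;

    void addSubject(String s) { subject = s; }
    void addPredicate(String p) { predicate = p; }
    void addObject(String o) { object = o; }
    void addType() { typed = true; }

    /** How many of the four quad components are still missing. */
    int missing() {
        int n = 0;
        if (subject == null) n++;
        if (predicate == null) n++;
        if (object == null) n++;
        if (!typed) n++;
        return n;
    }

    /** True once all four components of the quad have been seen. */
    boolean isComplete() {
        return missing() == 0;
    }

    /** Triples that explicit storage would hold for these components. */
    int explicitTripleCount() {
        return 4 - missing();
    }

    /**
     * The compact form: once the quad is complete, only the reified
     * statement itself needs to be kept. Returns null until then.
     */
    String compactStatement() {
        if (!isComplete()) return null;
        return subject + " " + predicate + " " + object;
    }
}

class FragmentsDemo {
    public static void main(String[] args) {
        ToyQuadFragments f = new ToyQuadFragments();
        f.addType();
        f.addSubject("ex:a");
        f.addPredicate("ex:knows");
        System.out.println(f.isComplete());          // false: rdf:object is still missing
        System.out.println(f.explicitTripleCount()); // 3
        f.addObject("ex:b");
        System.out.println(f.isComplete());          // true
        System.out.println(f.compactStatement());    // ex:a ex:knows ex:b
    }
}
</pre>

<p>In the complete case the sketch keeps one statement where explicit
storage would keep four triples.</p>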

<p><i>createReifiedStatement</i> is expected to bypass the
construction and detection of the quad components, so that in the
"usual case" they will never come into existence.</p>

<p>The details of this are described in a companion document.</p>



</body>
</html>

